-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Security: fix joining cluster with production license #31341
Conversation
The changes made to disable security for trial licenses unless security is explicitly enabled caused issues when a 6.3 node attempts to join a cluster that already has a production license installed. The new node starts off with a trial licenses and `xpack.security.enabled` is not set for the node, which causes the security code to skip attaching the user to the request. The existing cluster has security enabled and the lack of a user attached to the requests causes the request to be rejected. This commit changes the security code to check if the state has been recovered yet when making the decision on whether or not to attach a user. If the state has not yet been recovered, the code will attach the user to the request in case security is enabled on the cluster being joined. Closes elastic#31332
Pinging @elastic/es-security |
@@ -254,7 +254,8 @@ | |||
|
|||
public XPackLicenseState(Settings settings) { | |||
this.isSecurityEnabled = XPackSettings.SECURITY_ENABLED.get(settings); | |||
this.isSecurityExplicitlyEnabled = settings.hasValue(XPackSettings.SECURITY_ENABLED.getKey()) && isSecurityEnabled; | |||
this.isSecurityExplicitlyEnabled = (isSecurityEnabled && settings.hasValue(XPackSettings.SECURITY_ENABLED.getKey())) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was a suggestion from @bleskes that I went ahead and adopted as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain the reasoning behind doing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for cases where a user with a production license on 6.0 to 6.2.x upgrades to 6.3.1+. We require TLS for production mode, so if TLS is enabled we can take that as security being explicitly enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question about the choice to always send auth if the state is not recovered (irrespective of the node settings)
@@ -98,7 +104,9 @@ public AsyncSender interceptSender(AsyncSender sender) { | |||
@Override | |||
public <T extends TransportResponse> void sendRequest(Transport.Connection connection, String action, TransportRequest request, | |||
TransportRequestOptions options, TransportResponseHandler<T> handler) { | |||
if (licenseState.isSecurityEnabled() && licenseState.isAuthAllowed()) { | |||
final boolean stateNotRecovered = isStateNotRecovered; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purposes of this assignment (from volatile to local) initially confused me, and I'm worried someone might think it is redundant and remove it in the future. Can we add a comment to try and discourage that from happening?
@@ -98,7 +104,9 @@ public AsyncSender interceptSender(AsyncSender sender) { | |||
@Override | |||
public <T extends TransportResponse> void sendRequest(Transport.Connection connection, String action, TransportRequest request, | |||
TransportRequestOptions options, TransportResponseHandler<T> handler) { | |||
if (licenseState.isSecurityEnabled() && licenseState.isAuthAllowed()) { | |||
final boolean stateNotRecovered = isStateNotRecovered; | |||
final boolean sendWithAuth = (licenseState.isSecurityEnabled() && licenseState.isAuthAllowed()) || stateNotRecovered; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that if the state is not recovered, then we're going to sendWithAuth
even if the local node explitily disables security.
If so, then why have the check at all? If it's safe to send the auth context it even if security is disabled, isn't it simpler to just send it for every outgoing request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thought as @tvernum here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interceptor is not registered at all if security is disabled.
elasticsearch/x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/Security.java
Lines 808 to 829 in 3b70e94
@Override | |
public List<TransportInterceptor> getTransportInterceptors(NamedWriteableRegistry namedWriteableRegistry, ThreadContext threadContext) { | |
if (transportClientMode || enabled == false) { // don't register anything if we are not enabled | |
// interceptors are not installed if we are running on the transport client | |
return Collections.emptyList(); | |
} | |
return Collections.singletonList(new TransportInterceptor() { | |
@Override | |
public <T extends TransportRequest> TransportRequestHandler<T> interceptHandler(String action, String executor, | |
boolean forceExecution, | |
TransportRequestHandler<T> actualHandler) { | |
assert securityInterceptor.get() != null; | |
return securityInterceptor.get().interceptHandler(action, executor, forceExecution, actualHandler); | |
} | |
@Override | |
public AsyncSender interceptSender(AsyncSender sender) { | |
assert securityInterceptor.get() != null; | |
return securityInterceptor.get().interceptSender(sender); | |
} | |
}); | |
} |
This change limits to only sending auth when security could possibly be enabled. Once we've recovered the state then the check is based upon the license. By having the check we maintain the ability to assert that all outgoing requests have authentication after the state has been recovered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The changes made to disable security for trial licenses unless security is explicitly enabled caused issues when a 6.3 node attempts to join a cluster that already has a production license installed. The new node starts off with a trial license and `xpack.security.enabled` is not set for the node, which causes the security code to skip attaching the user to the request. The existing cluster has security enabled and the lack of a user attached to the requests causes the request to be rejected. This commit changes the security code to check if the state has been recovered yet when making the decision on whether or not to attach a user. If the state has not yet been recovered, the code will attach the user to the request in case security is enabled on the cluster being joined. Closes #31332
The changes made to disable security for trial licenses unless security is explicitly enabled caused issues when a 6.3 node attempts to join a cluster that already has a production license installed. The new node starts off with a trial license and `xpack.security.enabled` is not set for the node, which causes the security code to skip attaching the user to the request. The existing cluster has security enabled and the lack of a user attached to the requests causes the request to be rejected. This commit changes the security code to check if the state has been recovered yet when making the decision on whether or not to attach a user. If the state has not yet been recovered, the code will attach the user to the request in case security is enabled on the cluster being joined. Closes #31332
* 6.x: [DOCS] Omit shard failures assertion for incompatible responses (#31430) [DOCS] Move licensing APIs to docs (#31445) backport of: add is-write-index flag to aliases (#30942) (#31412) backport of: Add rollover-creation-date setting to rolled over index (#31144) (#31413) [Docs] Extend Homebrew installation instructions (#28902) [Docs] Mention ip_range datatypes on ip type page (#31416) Multiplexing token filter (#31208) Fix use of time zone in date_histogram rewrite (#31407) Revert "Mute DefaultShardsIT#testDefaultShards test" [DOCS] Fixes code snippet testing for machine learning (#31189) Security: fix joining cluster with production license (#31341) [DOCS] Updated version in Info API example [DOCS] Moves the info API to docs (#31121) Revert "Increasing skip version for failing test on 6.x" Preserve response headers on cluster update task (#31421) [DOCS] Add code snippet testing for more ML APIs (#31404) Docs: Advice for reindexing many indices (#31279)
* master: [DOCS] Omit shard failures assertion for incompatible responses (#31430) [DOCS] Move licensing APIs to docs (#31445) Add Delete Snapshot High Level REST API Remove QueryCachingPolicy#ALWAYS_CACHE (#31451) [Docs] Extend Homebrew installation instructions (#28902) Choose JVM options ergonomically [Docs] Mention ip_range datatypes on ip type page (#31416) Multiplexing token filter (#31208) Fix use of time zone in date_histogram rewrite (#31407) Core: Remove index name resolver from base TransportAction (#31002) [DOCS] Fixes code snippet testing for machine learning (#31189) [DOCS] Removed and params from MLT. Closes #28128 (#31370) Security: fix joining cluster with production license (#31341) Unify http channels and exception handling (#31379) [DOCS] Moves the info API to docs (#31121) Preserve response headers on cluster update task (#31421) [DOCS] Add code snippet testing for more ML APIs (#31404) Do not preallocate bytes for channel buffer (#31400) Docs: Advice for reindexing many indices (#31279) Mute HttpExporterTests#testHttpExporterShutdown test Tracked by #31433 Docs: Add note about removing prepareExecute from the java client (#31401) Make release notes ignore the `>test-failure` label. (#31309)
The changes made to disable security for trial licenses unless security
is explicitly enabled caused issues when a 6.3 node attempts to join a
cluster that already has a production license installed. The new node
starts off with a trial licenses and
xpack.security.enabled
is notset for the node, which causes the security code to skip attaching the
user to the request. The existing cluster has security enabled and the
lack of a user attached to the requests causes the request to be
rejected.
This commit changes the security code to check if the state has been
recovered yet when making the decision on whether or not to attach a
user. If the state has not yet been recovered, the code will attach
the user to the request in case security is enabled on the cluster
being joined.
Closes #31332